Efficient Query Processing for Multi-Dimensionally Clustered Tables in DB2

Authors

  • Bishwaranjan Bhattacharjee
  • Sriram Padmanabhan
  • Timothy Malkemus
  • Tony Lai
  • Leslie Cranston
  • Matthew Huras
Abstract

We have introduced a Multi-Dimensional Clustering (MDC) physical layout scheme in DB2 version 8.0 for relational tables. Multi-Dimensional Clustering is based on the definition of one or more orthogonal clustering attributes (or expressions) of a table. The table is organized physically by associating records with similar values for the dimension attributes in a cluster. Each clustering key is allocated one or more blocks of physical storage with the aim of storing the multiple records belonging to the cluster in almost contiguous fashion. Block oriented indexes are created to access these blocks. In this paper, we describe novel techniques for query processing operations that provide significant performance improvements for MDC tables. Current database systems employ a repertoire of access methods including table scans, index scans, index ANDing, and index ORing. We have extended these access methods for efficiently processing the block based MDC tables. One important concept at the core of processing MDC tables is the block oriented access technique. In addition, since MDC tables can include regular record oriented indexes, we employ novel techniques to combine block and record indexes. Block oriented processing is extended to nested loop joins and star joins as well. We show results from experiments using a star-schema database to validate our claims of performance with minimal overhead.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment. Proceedings of the 29th VLDB Conference, Berlin, Germany, 2003
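The block oriented access idea in the abstract can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not DB2's implementation: the class `ToyMDCTable`, its two dimensions (`region`, `year`), the `block_size` parameter, and the method `block_and` are all hypothetical names chosen for illustration. Each block holds rows from a single combination of dimension values, each dimension has a block index mapping a key value to block IDs, and index ANDing becomes an intersection of block-ID sets followed by a scan of the surviving blocks.

```python
from collections import defaultdict


class ToyMDCTable:
    """Toy model (hypothetical): each block holds rows of one (region, year) cell."""

    def __init__(self, block_size=2):
        self.block_size = block_size
        self.blocks = []                      # block id -> list of rows
        self.cell_blocks = defaultdict(list)  # (region, year) -> block ids
        self.region_index = defaultdict(set)  # block index on region
        self.year_index = defaultdict(set)    # block index on year

    def insert(self, region, year, amount):
        ids = self.cell_blocks[(region, year)]
        # Reuse the cell's newest block if it has room, else allocate a new one
        # and register it once in each dimension's block index.
        if not ids or len(self.blocks[ids[-1]]) >= self.block_size:
            self.blocks.append([])
            ids.append(len(self.blocks) - 1)
            self.region_index[region].add(ids[-1])
            self.year_index[year].add(ids[-1])
        self.blocks[ids[-1]].append((region, year, amount))

    def block_and(self, region, year):
        """Block index ANDing: intersect block-id sets, then scan those blocks."""
        block_ids = (self.region_index.get(region, set())
                     & self.year_index.get(year, set()))
        return [row for b in sorted(block_ids) for row in self.blocks[b]]


table = ToyMDCTable(block_size=2)
for row in [("east", 2002, 10), ("east", 2003, 20), ("west", 2003, 30),
            ("east", 2003, 40), ("east", 2003, 50)]:
    table.insert(*row)

result = table.block_and("east", 2003)
print(result)
```

In this toy model the index entries are per block rather than per record, so the intersection works on a handful of block IDs, and because every block belongs to exactly one cell, each row in a qualifying block already satisfies both predicates.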

Similar resources

Efficient Bulk Deletes for Multi Dimensionally Clustered Tables in DB2

In data warehousing applications, the ability to efficiently delete large chunks of data from a table is very important. This feature is also known as Rollout or Bulk Deletes. Rollout is generally carried out periodically and is often done on more than one dimension or attribute. The ability to efficiently handle the updates of RID indexes while doing Rollouts is a well known problem for databa...

Performance Study of Rollout for Multi Dimensional Clustered Tables in DB2

In data warehousing applications, the ability to efficiently delete large chunks of data from a table is very important. This feature is also known as Rollout. Rollout is generally carried out periodically and is often done on more than one dimension or attribute. DB2 UDB V8.1 introduced a new physical clustering scheme called Multi Dimensional Clustering (MDC) which allows users to cluster dat...

DB2 with BLU Acceleration: So Much More than Just a Column Store

DB2 with BLU Acceleration deeply integrates innovative new techniques for defining and processing column-organized tables that speed read-mostly Business Intelligence queries by 10 to 50 times and improve compression by 3 to 10 times, compared to traditional row-organized tables, without the complexity of defining indexes or materialized views on those tables. But DB2 BLU is much more than just...

Pathfinder Meets DB2 Relational XQuery Optimization Techniques

We are taking the next big step towards the goal of a purely relational XQuery implementation. The Pathfinder XQuery compiler has been enhanced by a code generator that emits SQL. This code generator targets off-the-shelf relational database systems (e.g., DB2) and turns them into efficient and scalable XQuery processors. Our approach neither depends on modifications of the database kernel, nor...

Demonstrating Near Real-Time Analytics with IBM DB2 Analytics Accelerator

Version 3 of the IBM DB2 Analytics Accelerator (IDAA) takes a major step towards the vision of a universal relational DBMS that transparently processes both OLTP and analytical-type queries in a single system. Based on heuristics in DB2 for z/OS, the DB2 optimizer decides if a query should be executed by "mainline" DB2 or if it is beneficial to forward it to the attached IBM DB2 Analytics Opti...

Publication date: 2003